Clustering Web Sessions by Sequence Alignment

نویسندگان

  • Weinan Wang
  • Osmar R. Zaïane
چکیده

Clustering means grouping similar objects into groups such that objects within a same group bear similarity to each other while objects in different groups are dissimilar to each other. As an important component of data mining, much research on clustering has been conducted in different disciplines. In the context of web mining, clustering could be used to cluster similar clickstreams to determine learning behaviours in the case of e-learning, or general site access behaviours in e-commerce or other on-line applications. Most of the algorithms presented in the literature to deal with clustering web sessions treat sessions as sets of visited pages within a time period and don’t consider the sequence of the click-strem visitation. This has a significant consequence when comparing similarities between web sessions. We propose in this paper a new algorithm based on sequence alignment to measure similarities between web sessions where sessions are chronologically ordered sequences of page accesses.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Study and Evaluation of user’s behavior in e-commerce Using Data Mining

Data mining has matured as a field of basic and applied research in computer science. The objective of this dissertation is to evaluate, propose and improve the use of some of the recent approaches, architectures and Web mining techniques (collecting personal information from customers) are the means of utilizing data mining methods to induce and extract useful information from Web information ...

متن کامل

Similarity Matrix Based Session Clustering by Sequence Alignment Using Dynamic Programming

With the rapid increasing popularity of the WWW, Websites are playing a crucial role to convey knowledge to the end users. Every request of Web site or a transaction on the server is stored in a file called server log file. Providing Web administrator with meaningful information about user access behavior (also called click stream data) has become a necessity to improve the quality of Web infor...

متن کامل

Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems

Ontology is the main infrastructure of the Semantic Web which provides facilities for integration, searching and sharing of information on the web. Development of ontologies as the basis of semantic web and their heterogeneities have led to the existence of ontology matching. By emerging large-scale ontologies in real domain, the ontology matching systems faced with some problem like memory con...

متن کامل

Web Site Audience Segmentation Using Hybrid Alignment Techniques

We are working on behavioral marketing in the Internet. On one hand we observe the behavior of visitors, and on the other hand we trigger (in real-time) stimulations intended to alter this behavior. Real-time and mass-customization are the two challenges that we have to address. In this paper, we present a hybrid approach for clustering visitor sessions, based on a combination of global and loc...

متن کامل

Learning Web Users Profiles With Relational Clustering Algorithms

In the context of web personalization and dynamic content recommendation, it is crucial to learn typical user profiles. Although there exists several approaches to mine user profiles (such as association rules or sequential patterns extraction), this paper focuses on the application of relational clustering algorithms on web usage data to characterize user access profiles. These methods rely on...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2002